---------------------------

REPLICATION PACKAGE FOR "AGGREGATION AND THE ESTIMATION OF QUALITY CHANGE: APPLICATION TO U.S. IMPORT PRICES" BY MARCO ERRICO AND DANIAL LASHKARI (QJE)

To run the entire replication package, open master.m and specify the path to the package folder on your computer, the path to your Stata application, and a binary variable indicating whether you are running in batch mode.

Each of the following 4 sections runs independently of one another, so you can run only a specific section / sections if you would like.

----------

DIRECTORY OVERVIEW: 

-Imports (application to the price index of US imports, corresponds to Section 4 and Appendix D)
-Auto (validation on US auto data, corresponds to Section 5 and Appendix E)
-FBW (comparison of our estimates to Broda-Weinstein estimates at detailed product level, corresponds to parts of Appendix D)
-Monte Carlo (Monte Carlo simulation, corresponds to Appendix C)

See the later sections for more detailed directory information about each of these folders 

----------

REQUIREMENTS:

-This was run with 50 GB of memory, 32 parallel workers, and 10 GB of memory for each parallel worker

Stata:
-This was run using Stata 18 MP/SE

-We used the following packages from SSC, updated as of February 2025: ivreg2, ranktest, gtools, mat2txt, estout, winsor2, distinct, coefplot, binscatter
-We also used net install to download the following packages:
	-net install binsreg, from(https://raw.githubusercontent.com/nppackages/binsreg/master/stata) 
	-net install binscatter2, from("https://raw.githubusercontent.com/mdroste/stata-binscatter2/master/")
	-net install ivreghdfe, from(https://raw.githubusercontent.com/sergiocorreia/ivreghdfe/master/src/) *
	-net install ftools, from(https://raw.githubusercontent.com/sergiocorreia/ivreghdfe/master/src/) *
	-net install reghdfe, from(https://raw.githubusercontent.com/sergiocorreia/ivreghdfe/master/src/) *

* For these packages, use the net install commands above rather than ssc install; otherwise the code will not run. To be safe, first run "cap ado uninstall" for each of these packages.
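
The installation steps above can be sketched as a short do-file. This is a minimal sketch, not part of the package: it assumes an internet connection, and ssc install fetches the current SSC versions, which may differ from the February 2025 versions we used. The three starred packages are only uninstalled here; install them afterwards with the starred net install commands listed above.

```stata
* Sketch only: installs current SSC versions, which may differ from the
* February 2025 versions used for the paper
foreach pkg in ivreg2 ranktest gtools mat2txt estout winsor2 distinct coefplot binscatter {
    ssc install `pkg', replace
}

* binsreg and binscatter2, from the addresses listed in this README
net install binsreg, from(https://raw.githubusercontent.com/nppackages/binsreg/master/stata) replace
net install binscatter2, from("https://raw.githubusercontent.com/mdroste/stata-binscatter2/master/") replace

* Remove any existing copies of the starred packages, so that the starred
* net install commands in this README start from a clean state
foreach pkg in ivreghdfe ftools reghdfe {
    cap ado uninstall `pkg'
}
```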

Matlab:
-This was run using Matlab R2023b

-We used the following toolboxes: Statistics, Statistics and Machine Learning, Parallel Computing

-We also use the folder PanelDataMATLAB-master, which is from http://www.paneldatatoolbox.com 
(Álvarez, Inmaculada C.; Barbero, Javier; and Zofío, José L. (2017). A Panel Data Toolbox for MATLAB. Journal of Statistical Software, Volume 76, Issue 6, pp. 1-27. http://dx.doi.org/10.18637/jss.v076.i06)

Latex:
-Insert \usepackage{booktabs} in your document preamble to compile the tables as we intended

----------

TIME TO RUN * :
-Imports (2 hours)
-Auto, Kimball (1 hour)
-Auto, GMY Replication (1 hour)
-Auto, GMY Replication Mixed CES (5 hours)
-FBW (A few minutes)
-Monte Carlo (6 hours)

* Times are approximate and were measured on our own computer. They may vary across computers / environments



---------------------------------------



IMPORTS:

-raw_data 	- see below for what raw data we use and where it was obtained
-temp_data 	- temporary files used while creating our final data / estimates (initially empty)
-cleaned_data 	- final data / estimates that we run analysis on (initially empty)
-cleaning_scripts 	- more basic cleaning scripts 
-algorithm_scripts 	- scripts where we run some type of estimation
-analysis_scripts 	- scripts where figures / tables are created 
-tables 	- see below for which tables are created in which scripts
-figures 	- see below for which figures are created in which scripts


Raw Data:

nberces5818v1_n2012.dta - NBER-CES Manufacturing Industry Database
https://www.nber.org/research/data/nber-ces-manufacturing-industry-database
Go to the link above, scroll down under "Data and Supporting", and click the link that says "Stata" to the right of "2012 NAICS Version," and the data will download. Accurate as of 10/21/2024, confirmed correct dataset. More details on the construction of this dataset can be found here: https://data.nber.org//nberces/nberces5818v1/nberces5818v1_technical_notes_Mar2021.pdf

BEA-BLS-industry-level-production-account-1987-2021.xlsx
https://www.bea.gov/data/special-topics/integrated-industry-level-production-account-klems
Click on "Data," then "Production Account Tables, 1987-2021," and the data will download. Or, go to BEA website, click on the "Data" header tab, then "Data by Topic," then "Special Topics" on the left, then scroll down and click "Integrated Industry-Level Production Account (KLEMS)," and that will take you to the link above and you can then follow the two steps listed initially. Accurate as of 10/21/2024. 

GrossOutput.xlsx
https://www.bea.gov/
* The current version of this data differs from the archived version used in the paper, so the steps below show how to download the archived one. Go to the link above. Click the "Data" header, then "By topic." Then click "GDP by Industry." Scroll down and click "Previously Published Estimates," then "Data Archive" in the drop-down. Then click "Gross Domestic Product by Industry and Input-Output Statistics," then "2022, Q2," and then "Underlying detail" to download the data. This downloads a folder called "UGdpxByInd"; we use "GrossOutput.xlsx" within it. Accurate as of 11/22/2024.

To download the current data (not the one used in the paper), go to https://www.bea.gov/itable/gdp-by-industry. Under the square/rectangular block, click the orange bullet that says "Underlying Detail Tables, Annual, 1997-2022." Then click the link that says "view detail level tables and other bulk download files in XLSX format." Then scroll down and, underneath "Underlying detail," click "Gross Output by Industry," and the data will download. Accurate as of 10/21/2024.

GDPbyInd_GO_1947-1997.xlsx
https://www.bea.gov/itable/gdp-by-industry
Go to the link above. Under the square block, click the orange bullet that says "Historical 1947-1997 Data." Then click the "Annual Tables" link which is to the right of "Download historical GDP by industry ZIP file:," and the data will download. The file we want (title at the start of this paragraph) will be within the ZIP file that downloads called "AllTablesHist." Accurate as of 10/21/2024. 

raw_data_1989_2019_export - Folder with all of the export data
https://sompks4.github.io/
Go to the link above, click on "Data," then click on "imports and exports" to the right of "Files:" under "2. US HS-level imports and exports (1989-2023)." Then download all the "exp-" ZIP files from 89 to 119, and use the .dta file in each ZIP file. Accurate as of 10/22/2024.

raw_data_1989_2019 - Folder with all of the import data
https://sompks4.github.io/
Same process as above, except use the "imp-" files instead.

hts_concordances_20190712_198906_201901.dta - Peter Schott's consistent HS category concordance 
https://sompks4.github.io/
Go to the link above, click "Data," then click "updated import concordance archive (v2019.7.12)" to the right of "Files" under "3. US HS over time concordances." Inside the ZIP file on Dropbox you will find the .dta file. Accurate as of 10/22/2024.

temp_hts10_naics5_concordance_exports.dta / temp_hts10_naics5_concordance_imports.dta - Amiti-Heise NAICS-HS concordance
https://www.sebastianheise.com/
Go to the link above, click on the "Research" header, click on the "HTS10-NAICS Concordances" link under "US Market Concentration and Import Competition," and the files will be in the downloaded ZIP file. Accurate as of 10/22/2024.

PCEPI.xlsx - PCE data from FRED - Personal Consumption Expenditures: Chain-type Price Index, Index 1989=100, Annual, Seasonally Adjusted
https://fred.stlouisfed.org/
Go to the link above, which is the FRED website. In the search bar, search for "Personal Consumption Expenditures: Chain-type Price Index," and click the first result, which should have the same name. Change the starting date of the graph to January 1989 and the end date to January 2018. Click "Edit Graph," then change the frequency to "Annual." In "Units," change the index to 1989-01-01. Then click download, Excel. Accurate as of 11/15/2024.

IR.xlsx - IPI data from FRED - Import Price Index (End Use): All Commodities, Index 1989=100, Annual, Not Seasonally Adjusted
https://fred.stlouisfed.org/
Go to the link above, type "Import Price Index (End Use): All Commodities" into the search bar, and click the first link, which should have the same name. Change the starting date of the graph to January 1989 and end date to January 2018. Click "Edit Graph," then change the frequency to be "Annual." In "Units," change the index to 1989-01-01. Then click download, Excel. Accurate as of 11/15/2024.
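
The "Units" step in the two FRED downloads above rescales each series so that it equals 100 in the chosen base period. As a minimal Stata sketch of that arithmetic, using made-up index values and hypothetical variable names (year, idx, idx_1989) rather than the actual FRED data:

```stata
clear
* Illustrative values only, not FRED data
input int year double idx
1989 60
1990 63
1991 66
end

* Store the base-year (1989) level in r(mean)
quietly summarize idx if year == 1989, meanonly
* Rebase so that the index equals 100 in 1989
generate double idx_1989 = 100 * idx / r(mean)
list year idx idx_1989
```

In this toy example the 1990 value becomes 100 * 63 / 60 = 105.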

IM752.xlsx - IPI data for Computer Equipment pre-2005
https://fred.stlouisfed.org/
Go to the link above, type "IM752" into the search bar, and click on "Import Price Index (SITC): Computer equipment (DISCONTINUED)," which should pop up. Click "Edit Graph," then "Units," and then change the index to be 100 at 2005-12-01. Then click download, Excel. Accurate as of 11/15/2024. 

IZ3341.xlsx - IPI data for Computer Equipment post-2005
https://fred.stlouisfed.org/
Go to the link above, type "IZ3341" in the search bar, and click on "Import Price Index (NAICS): Computer and Peripheral Equipment Manufacturing," which should pop up. Change the end date to December 2018. Then click download, Excel. Accurate as of 11/15/2024. 

IPUEN3341T050000000.xlsx - PPI data for Computer Equipment
https://fred.stlouisfed.org/
Go to the link above, type "Sectoral Output Price Deflator for Manufacturing: Computer and Peripheral Equipment Manufacturing (NAICS 3341) in the United States" in the search bar. Click the drop-down arrow that says "1 other format," and click "Annual, Index 2017 = 100, Not Seasonally Adjusted." Click "Edit Graph" and click "Units" and change the index to be 100 at 1995-01-01. Then change the Start date to 1995-01-01 and the end to 2018-01-01. Then click Excel, and download. Accurate as of 11/15/2024. 

IOUse_Before_Redefinitions_PRO_1963-1996_Summary - Input-Output Tables 1963-1996
https://www.bea.gov/
Go to the link above, type "Input Output Tables" in the search bar, and click the second result, which should say "Input-Output Accounts Data." Scroll down and click "Historical Make-Use Tables," then click "1963-1996: 65 Industries" under "Make Tables/Before Redefinitions." Accurate as of 11/22/2024.

IOUse_Before_Redefinitions_PRO_1997-2021_Summary - Input-Output Tables 1997-2021
https://www.bea.gov/
Go to the link above, go to "Tools" then "Interactive Data." Click "Input-Output" on the left, then the orange "Interactive Data Tables." Then click "view detail level tables and other bulk download files in XLSX format." Then under "Make-Use," click "All Tables," and the downloaded folder should have the file we are looking for. Accurate as of 11/22/2024.

Reexport-WebProduct-2015-2021.xlsx  
https://www.bea.gov/system/files/2021-02/reexports-2015-2019.xlsx
/
https://www.bea.gov/sites/default/files/2021-06/trad1321.xlsx


Figure / Table creation:

0_data_sanitycheck
-Figure D.1 

4_PostEstimation.do
-Table D.5  
-Table 1
-Table D.6 
-Figure 1 
-Figure 2, Center 
-Figure D.11, Center 
-Figure 2, Left 
-Figure D.11, Left 
-Figure 2, Right 
-Figure D.11, Right
-Figure D.9

4_1_PostEstimation.do
-Figure E.2, Right 

5_Welfare_Decomposition_Variety.do
-Table 2 
-Figure 3, Left  
-Figure D.5, Left

6_Welfare_Decomposition.do
-Table D.7 
-Figure D.7  
-Figure D.10 

7_Quality_Decomposition.do
-Figure D.6, Left 
-Figure D.12 
-Figure D.6, Right
-Table D.4  

7_Quality_Decomposition_Sector.do
-Figure D.5, Right 
-Figure 3, Right  

8_Kimball_CES_Comparison 
-Figure D.4 


----------

AUTO: 

-raw_data 	- see below for what raw data we use and where it was obtained
-temp_data 	- temporary files used while creating our final data / estimates (initially empty)
-cleaned_data 	- final data / estimates that we run analysis on (initially empty)
-cleaning_scripts 	- more basic cleaning scripts 
-algorithm_scripts/GMY Replication 	- Replication package for "Evolution of Market Power in the US Auto Industry" by Paul Grieco, Charlie Murry, and Ali Yurukoglu. Small modifications to their code are listed in main_est.m
-algorithm_scripts/GMY Replication_Mixed_CES 	- Same replication package as above, but larger modifications to adapt for Mixed CES. Modifications to their code are listed in main_Mixed_CES.m
-algorithm_scripts/C_KimballCar 	- Our estimation code 
-analysis_scripts 	- scripts where figures / tables are created 
-tables 	- see below for which tables are created in which scripts
-figures 	- see below for which figures are created in which scripts


Raw Data:

The following files are from the GMY Replication package, though some of them include revisions
-data_blp.csv (This is the same one as in the GMY Replication folder)
-data_blp_raw.dta
-data_blp_original.dta 

-CPI_Auto

-PCE.csv 

Total Personal Consumption Expenditure:
Download: https://fred.stlouisfed.org/series/PCE
Deflate to 2015 dollars

Per capita PCE:
https://fred.stlouisfed.org/series/A794RC0Q052SBEA

PCE price indices:
https://fred.stlouisfed.org/series/AB67RG3Q086SBEA
https://fred.stlouisfed.org/series/PCEPI

PCE Expenditure on new vehicles:
https://fred.stlouisfed.org/series/A136RC1Q027SBEA

-PCE_outside.csv (found in the temp folder and GMY_Replication data folder)


The following data in the GMY_Replication data folder (and GMY_Replication_Mixed_CES data folder) is from the GMY replication package:
-CPS_households.csv
-CPIU_1967base.csv
-income_quintile_cutoffs_by_year.csv
-make_group_ids.csv 
-microMomentsCEX
-microMomentsMRI
-secondChoice_bootstrap.mat
-secondChoiceOut.mat 
-data_blp.csv


Figure / Table creation:

A_Data_SummaryStats.do 
-Table E.1 

G_Postestimation.do 
-Table E.4

E_PCA_characteristics_shares.do
-Figure E.5, Left 
-Figure E.5, Right 
-Table E.5  
-Table E.6 

G_1_Postestimation.do 
-Table 3 
-Figure 4, Left Panel 
-Figure 4, Right Panel 
-Figure E.10, Left 
-Figure E.10, Center 
-Figure E.10, Right 
-Figure E.7, Right 
-Figure E.7, Left 
-Figure E.8, Right
-Figure E.8, Left 
-Figure E.9, Left  
-Figure E.9, Center 
-Figure E.9, Right 
-Figure E.3 
-Table E.2 
-Table E.3
-Figure E.4, Left 
-Figure E.4, Right  

G_2_Postestimation.do
-Figure E.1, Left 
-Figure E.1, Right 
-Figure E.2, Left  

H_1_Welfare.do
-Table 4 
-Figure 5, Left  
-Figure 5, Right 

----------

FBW:

-raw_data 	- see below for what raw data we use and where it was obtained
-cleaned_data 	- modified data that we use for our analysis (initially empty)
-estimates	- our various estimates 
-scripts 	- scripts where figures / tables are created 
-tables 	- see below for which tables are created in which scripts
-figures 	- see below for which figures are created in which scripts


Raw Data:
-rauch_sitc.dta 
-SITC3 to SITC2 Conversion and Correlation Tables 


Figure / Table creation:

PostEstimation.do
-Table D.1
-Figure D.8
-Table D.2

Rauch.do 
-Table D.3
-Figure D.2, Left
-Figure D.2, Right 
-Figure D.3

----------

MONTE CARLO: 

-estimates	- estimates we get from Monte Carlo simulation
-scripts 	- scripts where data is generated, Monte Carlo is run, and figures are created  
-figures 	- see below for which figures are created in which scripts


Figure / Table creation:

PostAnalysis.m
-Figure C.1
-Figure C.2













